I. Introduction

The wide use of robotic manipulators is handicapped by their inability to operate in an unstructured
environment. Existing industrial systems operate by moving along pretaught trajectories or, in the best
case, rely on a priori CAD world models. Since these motions are "blind", any distortion in the robot trajectory
due to calibration errors can be catastrophic. Besides the obvious inflexibility of such an approach, significant
problems of robot calibration have to be solved in order to execute these motions [Stone].

Integrating 3D vision systems with robot manipulators will allow robots to operate in a poorly
structured environment by visually locating targets and obstacles. However, using computer vision for
object acquisition makes the problem of overall system calibration even more difficult. Indeed, in
CAD-based manipulation the control architecture has to find an accurate mapping between the 3D Euclidean work
space and the robot configuration space (joint angles). If stereo vision is involved, then one needs to map a
pair of 2D video images directly into the robot configuration space. The neural network approach [Kup88] aside,
a common solution to this problem is to calibrate the Vision and Manipulator subsystems independently, and then
tie them together via a common mapping into the task space. In other words, both Vision and Robot refer to some
common Absolute Euclidean Coordinate Frame via their individual mappings.

This approach has two major difficulties. First, the Vision System has to be calibrated over the total
work space. Second, the Absolute Frame, which is usually quite arbitrary, has to be the same, to a
high degree of precision, for both the Robot and the Vision subsystem calibrations.

This paper describes work currently in progress at JPL on the use of computer vision to allow robust
fine-motion manipulation in a poorly structured world, along with preliminary results, problems
encountered, and directions for future work.

II. System Setup Description 

The JPL Telerobot Testbed, where the described work has been done, includes several subsystems
working cooperatively: the low-level Manipulation and Control Mechanization (MCM) subsystem, the
Run Time Control (RTC) subsystem, the Operator Control Station (OCS), the AI Task Planner,
and the Sensing and Perception (S&P) subsystem. More detailed descriptions of the Telerobot Testbed and
its individual subsystems can be found elsewhere [Matij], [Kan 1-2], [Stone-2]. Robot vision is provided by the
S&P subsystem, which determines 3D object positions and orientations by matching a pair of stereo images
with prestored polyhedral models. A detailed description of the algorithms and special-purpose hardware
used by S&P is given in [Gennery].




The Telerobot Testbed has five video cameras available to the Sensing and Perception
subsystem (S&P). Three cameras are stationary and provide a wide-angle (F = 12.5 mm) view of the total
workspace. Two cameras are mounted on the wrist of a Puma-560 manipulator, called the Camera Arm, and can
be used to provide close-view stereo (F = 25.0 mm). The stationary cameras are used mainly to acquire
and track large objects such as satellites, and do not provide enough resolution for fine-motion manipulation.
Only the movable close-view cameras can be used to provide positional information for the objects in a task
space. Here we will be concerned only with the stereo cameras mounted on the Puma arm. Therefore the terms
"cameras" and "vision" will refer only to the pair of movable cameras and not to the three stationary
wing cameras.

The movable cameras can be brought close enough to the Task Board to provide a good view of a
particular object of interest; however, the total work volume in which an object can be in focus in both cameras
is only about 1' x 1' x 1'. As a result, any realistic Payload Servicing Task requires the ability to move the
cameras and to perform machine vision operations from arbitrary locations.

The Sensing and Perception (S&P) subsystem was designed around stationary cameras, which have
to be calibrated only once. Because the camera calibration process is very time-consuming, the same
approach cannot be used for the movable cameras. It is essential that the 2D → 3D mapping of the
cameras be found only once, at a particular configuration of the Camera Arm, and then be recomputed
for any arbitrary Camera Arm configuration. The following method allows such an alteration.

III. Extension of the Vision Work Space 

The camera model used by the Sensing and Perception Subsystem is composed of four 3-vectors:
c, a, h, and v, which are known as the center, axis, horizontal, and vertical vectors, respectively (Fig. 1).
These vectors describe the relationship between the 3D coordinates of the Telerobot Reference Frame (used
as the Absolute Frame) and the 2D image coordinates of a stationary camera. They are derived from a rather
extensive and time-consuming calibration process. As long as the camera is not moved from its calibrated
position, these vectors are sufficient as they are. If, however, the camera needs to be moved, then the
vectors of the camera model must be altered to reflect the movement.


For each camera, an arbitrarily placed camera reference point is defined which is rigidly
attached to the camera. While this additional coordinate frame can be set anywhere, to simplify computations
it was made coincident with the wrist origin of the Camera Arm. If this point is located in the Telerobot
Reference Frame at position p (a 3-vector) and orientation q (a quaternion) before the camera is moved,
then the camera model (c, a, h, v) can be transformed to a new camera model (c', a', h', v') at location (p', q')
after the move by the following:

$$ c' = p' + R\,(c - p) $$
$$ a' = R\,a $$
$$ h' = R\,h \tag{*} $$
$$ v' = R\,v $$

where

$$ R = r(q')\,r^{T}(q) $$

and $r(q)$ is a function which transforms a quaternion $q$ into a rotation matrix (Fig. 1).
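The update (*) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the S&P implementation: the function names `quat_to_rot` and `move_camera_model` are hypothetical, and the quaternion is assumed to be a unit quaternion ordered (w, x, y, z), a convention the text does not specify.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix r(q) for a quaternion q = (w, x, y, z) (assumed ordering)."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def move_camera_model(c, a, h, v, p, q, p_new, q_new):
    """Apply eq. (*): carry the (c, a, h, v) model from pose (p, q) to (p_new, q_new)."""
    c, a, h, v = (np.asarray(x, dtype=float) for x in (c, a, h, v))
    p, p_new = np.asarray(p, dtype=float), np.asarray(p_new, dtype=float)
    R = quat_to_rot(q_new) @ quat_to_rot(q).T   # R = r(q') r^T(q)
    c_new = p_new + R @ (c - p)                 # the center moves with the reference point
    return c_new, R @ a, R @ h, R @ v           # the direction vectors are only rotated

# Hypothetical example: the reference point moves 1 m along x and turns 90 degrees about z.
s = np.sqrt(0.5)
c2, a2, h2, v2 = move_camera_model(
    c=[0.1, 0.0, 0.0], a=[1.0, 0.0, 0.0], h=[0.0, 1.0, 0.0], v=[0.0, 0.0, 1.0],
    p=[0.0, 0.0, 0.0], q=[1.0, 0.0, 0.0, 0.0],
    p_new=[1.0, 0.0, 0.0], q_new=[s, 0.0, 0.0, s],
)
```

Only the center vector depends on the reference positions p and p'; the other three vectors depend on the orientations alone, so errors in q and q' affect all four vectors while errors in p and p' affect only c'.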


However, a practical application of this algorithm leads to a significant mismatch between the real
and the S&P-perceived absolute object positions. The errors are about 2 - 4 cm and result from the combination
of errors in the Puma Arm nominal kinematics, Vision calibration based on a fictitious absolute frame, and
measurement errors in the absolute positions of the task objects. These errors could be significantly reduced by
using a more detailed model of the arm kinematics [Hayati] and by much more elaborate measurements for the
Task Data Base. However, it is our opinion that the application of computer vision in robotics should
lead to inherently robust algorithms which provide the ability to work with nominally known systems
operating in an unstructured world. To achieve these results, we replaced the whole concept of Vision
Based Absolute Motion with Vision Based Relative Motion.

The next section describes an approach which allows the use of moving cameras with
independent calibration of the Vision and Manipulation Systems, and without any significant improvement in
the determination of the absolute coordinates p and p'.

III. Absolute vs. Relative Robot Motions 

Let us compare the accuracy in positioning a Robot End Effector (EE) in two cases. In the first
case the Robot is asked to reach a position $P$ given in some Absolute Frame, say the Telerobot Reference Frame
(Fig. 2). In the second case we command the Robot to move its EE to a position $P_1$ which
is known relative to the EE start location $P_0$ (Fig. 3).

We will use standard matrix notation to describe object positions and orientations in the standard
$R^3 \times SO(3)$ space [Paul]. Let $Z$ be the transformation from the origin of the absolute frame to the shoulder
of the Puma manipulator, and let $T^{0}_{P}$ be the transformation from the shoulder to some point $P$, say the EE or
the camera reference point. For any true value $Y$ we will denote its computed or otherwise obtained estimate as $\hat{Y}$.

A. Absolute Motion 

Let us put the EE at an absolute location $P$ which is described in the absolute frame of reference by a
transformation $T^{EE}_{des}(P)$. The desired joint angles can be found (and are actually computed in the existing
Telerobot Testbed) via the inverse kinematics of $T^{0}_{E}(P)$, where

$$ T^{EE}_{des}(P) = Z\,T^{0}_{E}(P) \tag{1} $$

and therefore, with the available estimate of $Z$,

$$ \hat{T}^{0}_{E}(P) = \hat{Z}^{-1}\,T^{EE}_{des}(P) \tag{2} $$

A Nominal Inverse Kinematics (NIK) algorithm converts $\hat{T}^{0}_{E}(P)$ into a set of desired joint angles $J_{des}$, which
are sent to the joint controllers. As a result, the arm moves to some set of joint angles $J$ which is slightly
different from $J_{des}$. Due to the imperfection of the NIK model and the joint controllers, the achieved location
$P_1$ of the EE differs from the desired $P$. The resulting shoulder-EE transform $T^{0}_{E}(P_1)$ differs from
the commanded $\hat{T}^{0}_{E}(P)$ by some error $D_{kin}(P)$ (if $P$ and $P_1$ are close then $D_{kin}(P) \approx D_{kin}(P_1)$):

$$ T^{0}_{E}(P_1) = \hat{T}^{0}_{E}(P)\,D_{kin}(P) \tag{3} $$


The final position of the EE becomes

$$ T^{EE}_{abs}(P_1) = Z\,T^{0}_{E}(P_1) \tag{4} $$

and by substituting (3) into (4) we have

$$ T^{EE}_{abs}(P_1) = Z\,\hat{T}^{0}_{E}(P)\,D_{kin}(P) \tag{5} $$

By substituting (2) and (1) in (5) we can find the absolute error in the EE positioning

$$ T^{EE}_{abs}(P_1) = Z\,\hat{Z}^{-1}\,T^{EE}_{des}(P)\,D_{kin}(P) \tag{6} $$

If the estimate of the shoulder transform differs from its true value as

$$ Z\,\hat{Z}^{-1} = D_Z \tag{7} $$

then the final position of the EE becomes

$$ T^{EE}_{abs}(P_1) = D_Z\,T^{EE}_{des}(P)\,D_{kin}(P) \tag{8} $$


One can see from (8) that if $T^{EE}_{des}(P)$ involves large translational motions, which is usually the case, even
a small rotational error in $\hat{Z}$ may lead to a significant absolute difference between the desired and actual
positions of the End Effector.
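The lever-arm effect in eq. (8) can be checked numerically. The following is a minimal sketch with hypothetical numbers (a 0.5 degree rotational error in the shoulder transform, a 1 m translation, perfect kinematics); the helper names `rot_z`, `trans`, and `absolute_position_error` are illustrative only.

```python
import numpy as np

def rot_z(deg):
    """4x4 homogeneous rotation about the z axis by `deg` degrees."""
    t = np.radians(deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return T

def trans(x, y, z):
    """4x4 homogeneous pure translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def absolute_position_error(D_Z, T_des, D_kin):
    """Distance between the desired EE position and the one achieved per eq. (8)."""
    achieved = D_Z @ T_des @ D_kin
    return np.linalg.norm(achieved[:3, 3] - T_des[:3, 3])

# Hypothetical values: D_Z is a pure 0.5 degree rotation, the target is 1 m away,
# and the kinematic error D_kin is taken as the identity.
err_mm = 1000.0 * absolute_position_error(rot_z(0.5), trans(1.0, 0.0, 0.0), np.eye(4))
print(f"EE position error: {err_mm:.2f} mm")
```

With these assumed values the half-degree error alone displaces the End Effector by roughly 8.7 mm, and the displacement grows in proportion to the distance from the absolute origin.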

If $T^{EE}_{des}$ is computed from vision or CAD data, it has its own error

$$ T^{EE}_{des}(P) = \hat{T}^{EE}_{abs}(P) = T^{EE}_{abs}(P)\,D_{vis}(P) \tag{9} $$

and the final difference between the desired position $P$ and the achieved $P_1$ becomes

$$ T^{EE}_{abs}(P_1) = D_Z\,T^{EE}_{abs}(P)\,D_{vis}(P)\,D_{kin}(P) \tag{10} $$

Experimental results with the PUMA 560 setup at the Telerobot Testbed showed that a Puma
arm with a carefully calibrated NIK can achieve about 5 mm accuracy. After the CAD Data Base was
updated by touching a number of points on the Task Board with a special tool, the accuracy improved
to about 2 mm.

The use of the Vision System in a single, carefully calibrated configuration leads to about the same
result. However, if the "moving camera model" described in the section on the extension of the vision work
space is used, then the vision-perceived object positions are about 20 - 40 mm off. Fig. 4 shows a Vision
Overlay obtained with the stationary calibrated cameras. Fig. 5 shows the performance degradation when the
"Moving Cameras" method described above (eq. (*)) is used.

B. Relative Motion 

In the previous subsection we have shown that the major potential source of robot positioning
errors is the uncertainty in the robot location relative to the "Absolute" coordinate frame.

Suppose now that we want to make a relative motion of the EE from its current position $P_0$ to $P_1$ by
some well-known transformation $A$. Then the desired end position is

$$ T^{EE}_{des}(P_1) = T^{EE}_{abs}(P_1) = T^{EE}_{abs}(P_0)\,A $$

An estimate of the desired position has to be based on the perceived current location of the EE

$$ \hat{T}^{EE}_{abs}(P_1) = \hat{T}^{EE}_{abs}(P_0)\,A \tag{11} $$

or from (1), (2), (3) and (11)

$$ T^{EE}_{des}(P_1) = \hat{Z}\,D^{-1}_{kin}(P_0)\,T^{0}_{E}(P_0)\,A = \hat{Z}\,D^{-1}_{kin}(P_0)\,T^{0}_{E}(P_1) \tag{12} $$

By substituting (12) into (6) we have, for the position actually achieved,

$$ \hat{T}^{EE}_{abs}(P_1) = Z\,\hat{Z}^{-1}\,\hat{Z}\,D^{-1}_{kin}(P_0)\,T^{0}_{E}(P_1)\,D_{kin}(P_1) = Z\,D^{-1}_{kin}(P_0)\,Z^{-1}\,T^{EE}_{abs}(P_1)\,D_{kin}(P_1) \tag{13} $$

We see from (13) that if an arm has a good kinematics model, then its relative motion can
be done very accurately. But the most important feature of relation (13), from the point of view of
Vision - Motion coordination, is its independence of $\hat{Z}$. It shows that the Cameras and the Manipulator can be
calibrated independently, relative to their local frames. Namely, one needs to know only the manipulator's
inverse kinematics and to be able to determine the true 3D relative position between two objects.
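The cancellation of the shoulder-transform estimate in (13) can be verified numerically. The sketch below uses hypothetical transforms and, for simplicity, assumes perfect kinematics (both kinematic errors equal to the identity); the function names are illustrative and the estimate of Z is written `Z_hat`.

```python
import numpy as np

def rot_z(deg):
    """4x4 homogeneous rotation about the z axis by `deg` degrees."""
    t = np.radians(deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return T

def trans(x, y, z):
    """4x4 homogeneous pure translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def achieved_absolute(Z, Z_hat, T_target, D_kin):
    """Eq. (6): pose reached when an absolute target is commanded."""
    return Z @ np.linalg.inv(Z_hat) @ T_target @ D_kin

def achieved_relative(Z, Z_hat, T0E_P0, A, D_kin_P0, D_kin_P1):
    """Eqs. (12)-(13): pose reached when a relative move A is commanded."""
    # Eq. (12): the command is built from the perceived current pose.
    T_des = Z_hat @ np.linalg.inv(D_kin_P0) @ T0E_P0 @ A
    # Eq. (6) applied to that command; Z_hat cancels, which is eq. (13).
    return Z @ np.linalg.inv(Z_hat) @ T_des @ D_kin_P1

# Hypothetical setup: the true shoulder transform and two different wrong estimates.
Z = trans(0.5, 0.2, 0.0)
Z_hat_a = Z @ rot_z(0.5)
Z_hat_b = Z @ rot_z(-1.0) @ trans(0.01, 0.0, 0.0)
T0E_P0 = trans(0.6, 0.1, 0.3)      # shoulder-to-EE transform at the start pose
A = trans(0.0, 0.0, -0.05)         # desired relative move
I4 = np.eye(4)                     # perfect kinematics assumed
target = Z @ T0E_P0 @ A            # true desired absolute pose

rel_a = achieved_relative(Z, Z_hat_a, T0E_P0, A, I4, I4)
rel_b = achieved_relative(Z, Z_hat_b, T0E_P0, A, I4, I4)
abs_a = achieved_absolute(Z, Z_hat_a, target, I4)
```

Under these assumptions `rel_a` and `rel_b` both coincide with `target` whatever estimate of Z is used, whereas `abs_a` inherits the error of `Z_hat_a`.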

This property of (13) may be used for fine motion of the Robot EE. Say a particular object has to be
grasped by the EE. Suppose also that the grasping should start from an approach position which is located
at $A$ relative to the object. In other words, the desired absolute position of the EE is

$$ T^{EE}_{abs}(P_{apr}) = T^{obj}_{abs}\,A \tag{14} $$

If the current position of the EE is $P$, then a relative motion $B$ can move it to $P_{apr}$, where

$$ T^{EE}_{abs}(P)\,B = T^{obj}_{abs}\,A \tag{15} $$

Suppose now that, instead of the true absolute positions in (15), one is using the vision-derived values.
Then the desired adjustment $\hat{B}$ is

$$ \hat{B} = [T_{cam}(P)]^{-1}\,T_{cam}(P_{obj})\,A \tag{16} $$

Let $T_{abs}(P) = D_{vis}(P)\,T_{cam}(P)$; then from (16)

$$ \hat{B} = [D^{-1}_{vis}(P)\,T_{abs}(P)]^{-1}\,D^{-1}_{vis}(P_{obj})\,T_{abs}(P_{obj})\,A \tag{17} $$

If $P$ is close to $P_{obj}$, which is usually the case, then $D_{vis}(P) \approx D_{vis}(P_{obj})$ (since obviously
$D_{vis}(X) = D_{vis}(Y)$ when $X = Y$, and we expect the error to be a smooth function of $X - Y$). If, in addition,
$A$ is small, eq. (17) gives a very good approximation to the relative distance $B$ (the vision error is uniform
across the field of view), so that

$$ \hat{B} \approx B \tag{18} $$
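The cancellation behind (16) - (18) can be illustrated numerically. The sketch below is not the Testbed code: it assumes that the vision error is a single transform `D_vis` common to the EE and the object and applied on the absolute-frame side (the form that eq. (17) implies), and all numerical values and function names are hypothetical.

```python
import numpy as np

def rot_z(deg):
    """4x4 homogeneous rotation about the z axis by `deg` degrees."""
    t = np.radians(deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]
    return T

def trans(x, y, z):
    """4x4 homogeneous pure translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def relative_correction(T_ee, T_obj, A):
    """Eqs. (15)/(16): the relative move B with T_ee B = T_obj A."""
    return np.linalg.inv(T_ee) @ T_obj @ A

# True absolute poses (hypothetical values, metres and degrees).
T_abs_ee = trans(0.40, 0.10, 0.30)
T_abs_obj = trans(0.45, 0.12, 0.25) @ rot_z(10.0)
A = trans(0.0, 0.0, -0.05)                      # approach offset relative to the object

# Assumed common vision error of a few centimetres, shared by both measurements.
D_vis = trans(0.02, -0.01, 0.015) @ rot_z(1.0)
T_cam_ee = np.linalg.inv(D_vis) @ T_abs_ee      # vision-perceived EE pose
T_cam_obj = np.linalg.inv(D_vis) @ T_abs_obj    # vision-perceived object pose

B_true = relative_correction(T_abs_ee, T_abs_obj, A)   # from eq. (15)
B_hat = relative_correction(T_cam_ee, T_cam_obj, A)    # eq. (16)

abs_err_m = np.linalg.norm(T_cam_obj[:3, 3] - T_abs_obj[:3, 3])
```

Here the perceived absolute object position is off by more than 2 cm (`abs_err_m`), yet `B_hat` equals `B_true` to machine precision, because the shared error drops out of the product. If the error differed between the two measurements, the match would only be approximate, as (18) states.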


Eq. (18) together with (12) shows that, using Vision Based Relative Motion, vision and manipulator
calibration can be separated and performed in the relative sense only. Namely, instead of the accuracy of absolute
motions, only the manipulator's inverse and forward kinematics (motions in the shoulder frame) should be
calibrated. Similarly, only relative camera calibration (relative to the cameras' field of view) is important. Such
a calibration can be done only once with a high-precision calibration fixture, and then eq. (*) can be used.

IV. Discussion 

Vision Based Relative Motion calls for the following possible scenario. The operator moves the Cameras
to position an object of interest in the Cameras' field of view. Then the object is manually designated (say by using
an overlay wireframe) and its 3D position and orientation are computed by vision using the (*) calibration
correction. The EE is then positioned (autonomously) at $\hat{T}^{obj}_{abs} A$, which is probably 2 - 3 cm away from the
desired position $T^{obj}_{abs} A$. Then the position of the EE is verified by the Vision System as $\hat{T}^{EE}_{abs}$ and eq. (16) is used.
After the EE is accurately positioned, grappling or parts-mating operations can be done by using compliant
motion, etc.

Preliminary experiments on the JPL Telerobot Testbed have shown that the described method
allows reduction of the EE positioning error from about 2 cm to 1 mm with no noticeable orientation errors.

The most serious problem now is to design a "vision-friendly" EE to facilitate fast and accurate
EE verification by the Vision System.

In future research, if vision bandwidth allows, a vision-based motion can be implemented by
replacing the single approach point $A$ with a sequence $[A_1, A_2, ..., A_n]$ and by vision-assisted hopping
between the $A_i$.


V. Summary 

Vision Based Relative Motion together with teleoperation may work as a connecting link between
autonomous free motions and specific task-oriented macros based on compliant motion, proximity sensors,
and object CAD models. The right combination of these technologies will allow us to perform sophisticated
assembly and servicing operations in an unstructured world by using independently calibrated systems and
a limited number of robotics primitives.

Work in this direction is continuing at the JPL Telerobot Testbed.